Leveraging History for Faster Sampling of Online Social Networks

Authors

  • Zhuojie Zhou
  • Nan Zhang
  • Gautam Das
Abstract

With a vast amount of data available on online social networks, enabling efficient analytics has become an increasingly important research problem. Many existing studies resort to sampling techniques that draw random nodes from an online social network through its restrictive web/API interface. Almost all of these techniques share the same underlying technique, the random walk: a Markov Chain Monte Carlo based method that iteratively transits from one node to a random neighbor. Random walk fits naturally with this problem because, for most online social networks, the only query we can issue through the interface is to retrieve the neighbors of a given node (i.e., no access to the full graph topology). A problem with random walks, however, is the “burn-in” period, which requires a large number of transitions/queries before the sampling distribution converges to a stationary value that enables the drawing of samples in a statistically valid manner. In this paper, we consider a novel problem of speeding up the fundamental design of random walks (i.e., reducing the number of queries they require) without changing the stationary distribution they achieve, thereby enabling a more efficient “drop-in” replacement for existing sampling-based analytics techniques over online social networks. Technically, our main idea is to leverage the history of random walks to construct a higher-order Markov chain. We develop two algorithms, Circulated Neighbors Random Walk (CNRW) and Groupby Neighbors Random Walk (GNRW), and rigorously prove that, no matter what the social network topology is, CNRW and GNRW offer better efficiency than baseline random walks while achieving the same stationary distribution. We demonstrate through extensive experiments on real-world social networks and synthetic graphs the superiority of our techniques over the existing ones.
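As a rough illustration of the setting the abstract describes, the sketch below walks a graph using only a neighbors-of-a-node query. The toy graph, the function names, and the `history_aware_walk` variant are all assumptions made for illustration: the abstract does not specify how CNRW or GNRW work, so the second function only shows one plausible way that walk history could influence the next step (drawing neighbors without replacement per incoming edge), not the authors' algorithm.

```python
import random
from collections import defaultdict

# Hypothetical toy graph standing in for a social network. A real sampler
# would only see it through a "neighbors of this node" query.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c"],
}


def neighbors(node):
    """Stand-in for the restrictive interface: neighbors of a single node."""
    return GRAPH[node]


def simple_random_walk(start, steps, rng):
    """Baseline walk: each step moves to a uniformly random neighbor."""
    path = [start]
    for _ in range(steps):
        path.append(rng.choice(neighbors(path[-1])))
    return path


def history_aware_walk(start, steps, rng):
    """Assumed history-based variant (not taken from the abstract).

    For each incoming edge (u, v), the next node is drawn from v's
    neighbors without replacement; the pool is refilled once exhausted.
    """
    remaining = defaultdict(list)  # (u, v) -> neighbors of v not yet used
    path = [start]
    prev = None
    for _ in range(steps):
        cur = path[-1]
        key = (prev, cur)
        if not remaining[key]:
            remaining[key] = list(neighbors(cur))
        pool = remaining[key]
        path.append(pool.pop(rng.randrange(len(pool))))
        prev = cur
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    print(simple_random_walk("a", 10, rng))
    print(history_aware_walk("a", 10, rng))
```

Both functions issue one neighbor query per step, so the number of steps is a direct proxy for the query cost the abstract is concerned with.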

Similar resources

Interpersonal Trust in Online Scientific Social Networks: Causes and Results

Background and Aim: This study investigates the causes of interpersonal trust and the results of trust in online scientific social networks. Methods: This applied research used cluster sampling to collect data. The study population consisted of faculty members of Shiraz University and Persian Gulf University. A sample of 269 people was determined using the Morgan table according to the whole pop...

Walk, Not Wait: Faster Sampling Over Online Social Networks

In this paper, we introduce a novel, general-purpose technique for faster sampling of nodes over an online social network. Specifically, unlike traditional random walks, which wait for the convergence of the sampling distribution to a predetermined target distribution (a waiting process that incurs a high query cost), we develop WALK-ESTIMATE, which starts with a much shorter random walk, and then pro...

Proposing a model for entrepreneurship opportunities and challenges in online social networks in Iran

Human life has been affected by new communications in recent years. The development of cyberspace has led to businesses embracing social networks. The appearance of the cyberspace and features offered by information technology (IT) has provided hope, wishes, opportunities, and challenges for business owners. Organizations create opportunities for improving productivity, market share and value, ...

Relationship between the Online Social Networks Addiction and Psychological Disorders

Background: Addiction to online social networks, like other types of addiction, can lead to ethical dilemmas, and it can also be affected by psychological disorders. The aim of this research is therefore to analyze the effect of depression, anxiety, and usage time of online social networks on the level of online social network addiction and on life satisfaction. Method: The method of research ...

Analysis and Evaluation of Privacy Protection Behavior and Information Disclosure Concerns in Online Social Networks

Online Social Networks (OSNs) have become the largest infrastructure for social interactions such as forming relationships, sharing personal experiences, and service delivery. Nowadays social networks have been widely welcomed by people. Most research on managing privacy protection within social network sites (SNS) views users as the owners of their information. However, individuals cannot co...

Journal:
  • PVLDB

Volume 8  Issue 

Pages  -

Publication date 2015